On the Communication Complexity of Distributed Set-Joins
Abstract
Given a set-comparison predicate P and two lists of sets A = (A_1, ..., A_m) and B = (B_1, ..., B_m), with all A_i, B_j ⊆ [n], the P-set join A ⋈_P B is defined to be the set {(i, j) ∈ [m] × [m] | P(A_i, B_j)}, where [n] denotes {1, 2, ..., n}. When P(A_i, B_j) is the condition "A_i ∩ B_j ≠ ∅" we call this the set-intersection-not-empty join (a.k.a. the composition of A and B); when P(A_i, B_j) is "A_i ∩ B_j = ∅" we call it the set-disjointness join; when P(A_i, B_j) is "A_i = B_j" we call it the set-equality join; when P(A_i, B_j) is "|A_i ∩ B_j| ≥ T" for a given threshold T, we call it the set-intersection threshold join. Assuming A and B are stored at two different sites in a distributed environment, we study the (randomized) communication complexity of computing these, and related, set-joins A ⋈_P B, as well as the (randomized) communication complexity of computing the exact and approximate value of their size k = |A ⋈_P B|. Taken together, our analyses offer new insights into the quantitative differences between these set-joins. Furthermore, given the close affinity of the natural join and the set-intersection-not-empty join, our results also yield communication complexity results for computing the natural join in a distributed...
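To make the four definitions concrete, here is a minimal Python sketch that computes each P-set join on a single machine, as the set of pairs of 1-based list indices satisfying the predicate. It only illustrates the definitions; it is not one of the paper's communication protocols and says nothing about communication cost. The names `set_join`, `intersects`, `disjoint`, `equal`, and `threshold` are hypothetical and chosen for this illustration.

```python
from typing import Callable, FrozenSet, List, Set, Tuple

SetList = List[FrozenSet[int]]
Predicate = Callable[[FrozenSet[int], FrozenSet[int]], bool]


def set_join(A: SetList, B: SetList, P: Predicate) -> Set[Tuple[int, int]]:
    """Return all 1-based index pairs (i, j) for which P(A_i, B_j) holds."""
    return {
        (i, j)
        for i, a in enumerate(A, start=1)
        for j, b in enumerate(B, start=1)
        if P(a, b)
    }


def intersects(a: FrozenSet[int], b: FrozenSet[int]) -> bool:
    # Set-intersection-not-empty predicate: A_i and B_j share an element.
    return not a.isdisjoint(b)


def disjoint(a: FrozenSet[int], b: FrozenSet[int]) -> bool:
    # Set-disjointness predicate: A_i and B_j share no element.
    return a.isdisjoint(b)


def equal(a: FrozenSet[int], b: FrozenSet[int]) -> bool:
    # Set-equality predicate.
    return a == b


def threshold(T: int) -> Predicate:
    # Set-intersection threshold predicate: |A_i ∩ B_j| >= T.
    return lambda a, b: len(a & b) >= T


# Illustrative input (not from the paper): m = 2 sets per list, universe [4].
A = [frozenset({1, 2}), frozenset({3})]
B = [frozenset({2, 3}), frozenset({4})]

nonempty_join = set_join(A, B, intersects)
disjoint_join = set_join(A, B, disjoint)
equality_join = set_join(A, B, equal)
threshold_join = set_join(A, B, threshold(2))
k = len(nonempty_join)  # join size k for the not-empty join
```

On this input the not-empty join and the disjointness join partition all four index pairs, which reflects that the two predicates are complements of each other.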
Similar resources
Quantum Communication Complexity of Distributed Set Joins
Computing set joins of two inputs is a common task in database theory. Recently, Van Gucht, Williams, Woodruff and Zhang [PODS 2015] considered the complexity of such problems in the natural model of (classical) two-party communication complexity and obtained tight bounds for the complexity of several important distributed set joins. In this paper we initiate the study of the quantum communicat...
A Survey on Complexity of Integrity Parameter
Many graph theoretical parameters have been used to describe the vulnerability of communication networks, including toughness, binding number, rate of disruption, neighbor-connectivity, integrity, mean integrity, edge-connectivity vector, l-connectivity and tenacity. In this paper we discuss Integrity and its properties in vulnerability calculation. The integrity of a graph G, I(G), is defined t...
H2RDF+: High-performance distributed joins over large-scale RDF graphs
The proliferation of data in RDF format calls for efficient and scalable solutions for their management. While scalability in the era of big data is a hard requirement, modern systems fail to adapt based on the complexity of the query. Current approaches do not scale well when faced with substantially complex, non-selective joins, resulting in exponential growth of execution times. In this work...
Lattice Completion Algorithms for Distributed Computations
A distributed computation is usually modeled as a finite partially ordered set (poset) of events. Many operations on this poset require computing meets and joins of subsets of events. The lattice of normal cuts of a poset is the smallest lattice that embeds the poset such that all meets and joins are defined. In this paper, we propose new algorithms to construct or enumerate the lattice of norm...
Complexity and approximation ratio of semitotal domination in graphs
A set $S \subseteq V(G)$ is a semitotal dominating set of a graph $G$ if it is a dominating set of $G$ and every vertex in $S$ is within distance 2 of another vertex of $S$. The semitotal domination number $\gamma_{t2}(G)$ is the minimum cardinality of a semitotal dominating set of $G$. We show that the semitotal domination problem is APX-complete for bounded-degree graphs, and the semitotal dominatio...